📘 Microsoft Certification

AI-102 Azure AI Engineer
Associate — Full Course

Master all exam domains from scratch. Learn Azure AI services with code samples, architecture patterns, and 50 interactive practice questions with instant feedback.

  • 5 Core Modules
  • 50 Practice Q&A
  • 20+ Code Samples
  • 700/1000 Pass Score

📋
Exam At a Glance

Field | Detail
Full Name | Designing and Implementing a Microsoft Azure AI Solution
Duration | 120 minutes
Questions | ~40–60 (multiple choice, drag-and-drop, case studies)
Passing Score | 700 / 1000
Price | $165 USD (varies by region)
Provider | Pearson VUE (in-person or online proctored)
Prerequisites | Azure portal familiarity, REST APIs, Python or C#

📊
Exam Domain Weights

Plan & Manage Azure AI Solutions | 15–20%
Implement Computer Vision Solutions | 20–25%
Implement NLP Solutions | 20–25%
Knowledge Mining & Document Intelligence | 15–20%
Implement Generative AI Solutions | 15–20%
Module 1 · Section 1

Azure AI Services — The Big Picture

Understanding the Azure AI services catalog, resource types, tiers, and endpoints.

☁️
What Are Azure AI Services?

  • Azure AI Services = pre-built, cloud-hosted AI APIs from Microsoft — no model training required for most use cases
  • Two resource types:
      • Single-service: one endpoint, one key per service (e.g., Azure AI Vision only)
      • Multi-service (Cognitive Services): one key, one endpoint for ALL services — simpler billing
  • Always provisioned via: Azure Portal, Azure CLI, ARM templates, or Bicep
  • Pricing tiers: Free (F0) — limited quota; Standard (S0) — pay-per-call
  • Endpoint format: https://<resource-name>.cognitiveservices.azure.com/
Key Services in Scope
  • Azure AI Vision
  • Face API
  • Azure AI Language
  • Azure AI Speech
  • Translator
  • Azure OpenAI
  • Azure AI Search
  • Document Intelligence
Common CLI Commands
  • az cognitiveservices account create
  • az cognitiveservices account keys list
  • az cognitiveservices account keys regenerate
  • az cognitiveservices account show
The multi-service resource uses the generic endpoint cognitiveservices.azure.com. Individual services (Vision, Language) have their own endpoints but the key format is the same.
Module 1 · Section 2

Authentication & Security

API keys, Managed Identity, Azure Key Vault, and RBAC for AI services.

🔑
Authentication Methods

Key-Based Auth
  • Two keys per resource (rotate without downtime)
  • Store in Azure Key Vault, not in code
  • Passed as Ocp-Apim-Subscription-Key header
  • Easy but requires key management
Managed Identity (Recommended)
  • No credentials to manage or rotate
  • Assign Cognitive Services User RBAC role
  • Works with VMs, App Service, AKS, Functions
  • Use DefaultAzureCredential() in SDK
Never hard-code keys! Use environment variables or Azure Key Vault. Key Vault references in App Service settings are the standard pattern: @Microsoft.KeyVault(SecretUri=...)

💻
Python Authentication Examples

Python
# ── Option 1: Key-Based Authentication ──
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

client = ComputerVisionClient(
    endpoint="https://my-vision.cognitiveservices.azure.com/",
    credentials=CognitiveServicesCredentials("YOUR_KEY")
)

# ── Option 2: Managed Identity (PRODUCTION BEST PRACTICE) ──
from azure.identity import DefaultAzureCredential
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import TokenCredential

credential = DefaultAzureCredential()  # auto-picks env/MI/CLI
client = TextAnalyticsClient(
    endpoint="https://my-language.cognitiveservices.azure.com/",
    credential=credential
)

# ── Key Vault integration (retrieve key at runtime) ──
from azure.keyvault.secrets import SecretClient

kv_client = SecretClient(
    vault_url="https://my-vault.vault.azure.net/",
    credential=DefaultAzureCredential()
)
api_key = kv_client.get_secret("vision-api-key").value

👥
RBAC Roles for AI Services

Role | Can Read Keys? | Can Call APIs? | Can Manage Resource?
Owner | ✅ | ✅ | ✅
Contributor | ✅ | ✅ | ✅ (no RBAC assignments)
Cognitive Services Contributor | ✅ | ✅ | ✅
Cognitive Services User | ❌ | ✅ (Entra ID only) | ❌
Reader | ❌ | ❌ | ❌
Exam tip: Cognitive Services User enables calling APIs via Azure AD tokens but cannot read subscription keys. Use this for Managed Identity scenarios.
Module 1 · Section 3

Network Security

Private endpoints, VNet service endpoints, firewalls, and customer-managed keys.

🔒
Network Isolation Options

  • Default behavior: AI Services accept traffic from any IP (public endpoint)
  • IP Firewall rules: restrict to specific public IP ranges — still uses public endpoint
  • VNet Service Endpoints: route traffic through Azure backbone; endpoint remains public but only accessible from specified subnets
  • Private Endpoint (most secure): assigns a private IP inside your VNet — no public exposure
      • DNS: privatelink.cognitiveservices.azure.com resolves to the private IP
      • After creating the private endpoint, disable public access entirely
      • Requires an Azure Private DNS Zone for name resolution
For maximum isolation: Create Private Endpoint → Configure Private DNS Zone → Set "Allow access from: Disabled" for public network access in the Networking blade.

🗝️
Customer-Managed Keys (CMK)

  • Default: Microsoft manages encryption keys (platform-managed) — you cannot revoke
  • CMK: your organization's key stored in Azure Key Vault encrypts data-at-rest
  • Revocation: disable or delete the key in Key Vault → service data becomes immediately inaccessible
  • Requirements: Standard pricing tier + Azure Key Vault with soft-delete + purge protection enabled
  • Setup: Key Vault → Grant AI Service identity access → Configure CMK in AI resource settings
Module 1 · Section 4

Monitoring & Diagnostics

Azure Monitor, diagnostic logs, alerts, and Application Insights for AI services.

📈
Azure Monitor Integration

  • Metrics (built-in, no config): TotalCalls, TotalErrors, Latency, SuccessRate
  • Diagnostic Logs: enable via Portal → Monitoring → Diagnostic settings
  • Send to: Log Analytics Workspace (KQL queries), Storage Account (archive), Event Hub (stream)
  • Alerts: trigger on metric thresholds — e.g., "TotalErrors > 100 in 5 min"
  • Application Insights: distributed tracing for SDK-based apps — trace IDs, spans, custom telemetry

🔍
Useful KQL Queries

KQL — Log Analytics
// All AI Service diagnostic logs
AzureDiagnostics
| where ResourceType == "COGNITIVESERVICES"
| project TimeGenerated, OperationName, ResultType, DurationMs

// Error rate over time
AzureMetrics
| where MetricName == "TotalErrors"
| summarize sum(Total) by bin(TimeGenerated, 5m)
| render timechart

// Latency P95
AzureMetrics
| where MetricName == "Latency"
| summarize percentile(Average, 95) by bin(TimeGenerated, 1h)
Module 1 · Section 5

Responsible AI

Microsoft's six principles, Content Safety service, and Limited Access features.

⚖️
Microsoft's 6 Responsible AI Principles

Principle | Description
Fairness | AI systems should treat all people fairly, without discriminating by race, gender, age, etc.
Reliability & Safety | AI must perform reliably and safely, even in unexpected conditions
Privacy & Security | AI must respect privacy and protect user data
Inclusiveness | AI should empower everyone, including people with disabilities
Transparency | AI systems should be understandable; humans should be able to explain AI decisions
Accountability | Humans must be accountable for AI systems and their impacts

🛡️
Azure AI Content Safety

  • Dedicated service for detecting harmful content in text AND images
  • 4 harm categories: Hate, Sexual, Violence, Self-harm
  • Severity levels: 0 (safe) to 6 (very harmful)
  • Prompt Shield: detect jailbreak attempts and indirect prompt injection
  • Groundedness detection: check if AI response is supported by retrieved context (RAG)
  • Protected material detection: identify copyrighted text/code in model output
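The severity scores above lend themselves to a simple policy gate. A minimal local sketch (not the Content Safety SDK; the threshold values are hypothetical policy choices):

```python
# Sketch: turn per-category severity scores (0 = safe, 6 = very harmful)
# into an allow/block decision. Thresholds here are hypothetical policy
# choices, not service defaults.
THRESHOLDS = {"Hate": 2, "Sexual": 2, "Violence": 4, "SelfHarm": 2}

def moderate(scores: dict[str, int]) -> tuple[bool, list[str]]:
    """Return (allowed, categories whose severity met their threshold)."""
    flagged = [cat for cat, sev in scores.items() if sev >= THRESHOLDS.get(cat, 0)]
    return (not flagged, flagged)

allowed, flagged = moderate({"Hate": 0, "Sexual": 0, "Violence": 5, "SelfHarm": 0})
print(allowed, flagged)  # False ['Violence']
```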
Limited Access Features require Microsoft application/approval: Face API identification & verification, Custom Neural Voice, Speaker Recognition. These cannot be used without an approved use case.
Module 2 · Section 1

Azure AI Vision

Image Analysis 4.0, OCR, dense captions, smart crops, and object detection.

👁️
Image Analysis Features

Visual Features
  • Caption — single natural language description
  • Dense Captions — captions for image regions
  • Tags — confidence-scored keywords
  • Objects — bounding boxes + labels
  • Read (OCR) — extract printed/handwritten text
  • Smart Crops — thumbnail crop regions
  • People — detect people locations
API Details
  • API Version: 2023-10-01 (Image Analysis 4.0)
  • Max file size: 4MB
  • Formats: JPEG, PNG, BMP, GIF, TIFF, WEBP
  • Input: image URL or binary body
  • Features passed as query param: ?features=Caption,Tags,Read
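The features-as-query-parameter detail can be illustrated by assembling the REST URL by hand. A sketch, assuming a hypothetical resource name (the key goes in the Ocp-Apim-Subscription-Key header, never in the URL):

```python
from urllib.parse import urlencode

# Build an Image Analysis 4.0 REST URL with features as a query parameter.
# "my-vision" is a hypothetical resource name.
endpoint = "https://my-vision.cognitiveservices.azure.com"
params = urlencode({
    "api-version": "2023-10-01",
    "features": "caption,tags,read",  # comma-separated visual features
}, safe=",")
url = f"{endpoint}/computervision/imageanalysis:analyze?{params}"
print(url)
```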

💻
Code Sample — Analyze Image

Python
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential

client = ImageAnalysisClient(
    endpoint="https://my-vision.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("YOUR_KEY")
)

result = client.analyze_from_url(
    image_url="https://aka.ms/azsdk/image-analysis/sample.jpg",
    visual_features=[
        VisualFeatures.CAPTION,        # "a dog running on grass"
        VisualFeatures.TAGS,           # dog, outdoor, green, fun
        VisualFeatures.READ,           # OCR text extraction
        VisualFeatures.DENSE_CAPTIONS, # per-region captions
        VisualFeatures.OBJECTS,        # bounding boxes
    ],
    gender_neutral_caption=True,
)

# Results
print(f"Caption: {result.caption.text} ({result.caption.confidence:.2f})")
for tag in result.tags.list:
    print(f"  Tag: {tag.name} ({tag.confidence:.2f})")
for block in result.read.blocks:
    for line in block.lines:
        print(f"  OCR: {line.text}")
Module 2 · Section 2

Custom Vision

Train your own image classifiers and object detectors on custom categories.

🎯
Custom Vision Overview

  • Two project types: Classification and Object Detection
  • Classification: assign one (multi-class) or many (multi-label) categories per image
  • Object Detection: locate and label objects with bounding boxes
  • Minimum images: 15 per tag for classification; 15 labelled regions for detection
  • Training modes: Quick Training (fast, less accurate) vs Advanced Training (slower, better)
  • Domain types determine export options:
      • General (Compact): exportable to CoreML, TensorFlow, ONNX, Docker (edge/offline)
      • General (S1/A2): cloud API only — cannot be exported
  • Portal: customvision.ai

📤
Export Formats for Edge Deployment

Platform | Format | Use Case
iOS / macOS | CoreML | iPhone, iPad, Mac apps
Android | TensorFlow Lite | Android, Raspberry Pi
Windows / Cross-platform | ONNX | Windows ML, ONNX Runtime
Intel hardware | OpenVINO | Intel Neural Compute Stick
Any container host | Docker container | IoT Edge, AKS, on-prem
You must select a Compact domain BEFORE training if you want to export the model. You cannot export models trained with standard domains.
Module 2 · Section 3

Face API

Face detection, verification, identification, and responsible use requirements.

🧑
Face API Operations

Operation | Description | Type | Access
Detect | Locate faces + attributes (age, emotion, blur, head pose) | 1:0 | Open
Verify | Are two faces the same person? Returns Boolean + confidence | 1:1 | Limited
Identify | Who is this face? Search against a PersonGroup | 1:N | Limited
Find Similar | Find faces that look similar in a FaceList | 1:N | Limited
Group | Cluster unknown faces into groups | N:N | Open
Liveness | Anti-spoofing — real person vs photo/video | n/a | Open
Limited Access: Identify, Verify, Find Similar require Microsoft application approval. Cannot be used for surveillance without explicit consent. Emotion detection results should not be used in clinical/legal decisions.

💻
Face Identification Flow

Python — Face Identify Workflow
# 1. Create PersonGroup
face_client.person_group.create(person_group_id="employees", name="Employees")

# 2. Add persons and their face images
person = face_client.person_group_person.create("employees", name="Alice")
face_client.person_group_person.add_face_from_url(
    "employees", person.person_id, "https://example.com/alice.jpg"
)

# 3. Train the PersonGroup
face_client.person_group.train("employees")

# 4. Detect faces in a new photo
detected = face_client.face.detect_with_url("https://example.com/newphoto.jpg")
face_ids = [f.face_id for f in detected]

# 5. Identify against the group
results = face_client.face.identify(face_ids, person_group_id="employees")
for r in results:
    if r.candidates:
        print(f"Identified: {r.candidates[0].person_id} ({r.candidates[0].confidence:.2f})")
Exam: The latest recognition model is recognition_04 — use it for best accuracy. Always retrain PersonGroup after adding new face images.
Module 2 · Section 4

Video Indexer

AI-powered video analysis for transcription, faces, topics, brands, and more.

🎬
Video Indexer Insights

Audio Insights
  • Transcript (STT with timestamps)
  • Speaker diarization (who said what)
  • Keywords extracted from speech
  • Topics (Wikipedia-linked)
  • Emotions (joy, sadness, anger, fear)
Visual Insights
  • Faces (detect + identify)
  • Labels (objects/scenes)
  • Brands (logos, product names)
  • OCR (on-screen text)
  • Scenes & shots (visual segmentation)
  • Account types: Trial (limited uploads, free) vs. Paid (connected to Azure Media Services)
  • Indexing presets: Basic (audio only), Standard (audio+video), Advanced (+ deep analysis)
  • Widgets: embed Player Widget + Insights Widget in any web app via iframe
  • Access: videoindexer.ai portal or REST API with access token
Exam: If the question is about making video content searchable by spoken words → Video Indexer (transcript). If the question is about counting people in a zone or tracking movements → Azure AI Vision Spatial Analysis.
Module 3 · Section 1

Azure AI Language

Sentiment, NER, PII, key phrases, language detection — pre-built and custom.

📝
Pre-built Language Features

Feature | What It Does | Key Parameter
Sentiment Analysis | Positive / Negative / Neutral + Mixed + Opinion Mining | show_opinion_mining=True
Key Phrase Extraction | Extract important topics from text | n/a
NER | Persons, locations, dates, orgs, phone numbers | n/a
PII Detection | Identify & redact personal info (SSN, email, credit card) | Categories filter
Language Detection | Identify language + ISO code + confidence | n/a
Text Summarization | Abstractive or extractive summary | summary_type

💻
Code: Sentiment + Opinion Mining

Python
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint="https://my-language.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("KEY")
)

docs = ["Azure is powerful but the pricing is confusing."]

# Opinion mining reveals aspect-level sentiment
results = client.analyze_sentiment(docs, show_opinion_mining=True)
for doc in results:
    print(f"Overall: {doc.sentiment}")
    for sentence in doc.sentences:
        for opinion in sentence.mined_opinions:
            target = opinion.target
            print(f"  Target: {target.text} → {target.sentiment}")
            for assessment in opinion.assessments:
                print(f"    Assessment: {assessment.text} ({assessment.sentiment})")

# Output:
# Overall: mixed
# Target: Azure → positive  Assessment: powerful (positive)
# Target: pricing → negative  Assessment: confusing (negative)
Module 3 · Section 2

Question Answering & CLU

Custom Q&A knowledge bases, Conversational Language Understanding, and Orchestration Workflow.

Question Answering

  • Successor to QnA Maker — part of Azure AI Language
  • Create a knowledge base from: FAQ URLs, PDF/Word docs, or manual QA pairs
  • Custom QA — train on your content; Prebuilt QA — no project needed
  • Confidence score threshold: default 0; setting ≥0.5 is recommended in production
  • Multi-turn conversations: define follow-up prompts for clarification flows
  • Chitchat: add personality (professional, friendly, witty) for small talk
  • Active learning: uses low-confidence user queries to suggest new QA pairs
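The production-threshold habit can be sketched locally. The answer dicts below are a simplified stand-in for the service's richer response objects:

```python
# Sketch: apply a confidence threshold to Question Answering results and
# fall back to a default answer when nothing is confident enough.
DEFAULT_ANSWER = "Sorry, I couldn't find an answer to that."

def best_answer(answers: list[dict], threshold: float = 0.5) -> str:
    confident = [a for a in answers if a["confidence"] >= threshold]
    if not confident:
        return DEFAULT_ANSWER
    return max(confident, key=lambda a: a["confidence"])["answer"]

print(best_answer([
    {"answer": "Reset it in the portal.", "confidence": 0.82},
    {"answer": "Contact support.", "confidence": 0.31},
]))  # Reset it in the portal.
```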

🗣️
Conversational Language Understanding (CLU)

  • Successor to LUIS — same concept, improved model
  • Intents = what the user wants (e.g., BookFlight, CheckWeather)
  • Entities = key info extracted (e.g., City, Date, FlightClass)
  • Minimum utterances per intent: 5 recommended (15–30 for production)
  • Orchestration Workflow — meta-project that routes to CLU, QA, or LUIS based on input
  • Deploy to Language endpoint → call via REST or SDK with analyze_conversation()
Exam: "single app routing between intent recognizer and FAQ" → Orchestration Workflow. "extract what user wants and key data points" → CLU. "FAQ-style answers" → Question Answering.
Module 3 · Section 3

Azure AI Speech

STT, TTS, SSML, Custom Speech, Speaker Diarization, and Speech Translation.

🎤
Speech-to-Text (STT)

  • Real-time recognition: live microphone or streaming audio
  • Batch transcription: async processing of large audio files (Azure Blob) — use for 500+ hours
  • Custom Speech: upload audio + transcripts to improve accuracy for domain-specific vocabulary
  • Speaker Diarization: labels utterances by speaker — "who said what" — enabled via diarization_config
  • Keyword Recognition: detect specific wake words locally without cloud calls

🔊
Text-to-Speech (TTS) & SSML

  • 400+ voices, 140+ languages. Neural voices = highly natural speech
  • Custom Neural Voice: clone a voice from 300+ recordings — Limited Access
  • SSML (Speech Synthesis Markup Language) — control output precisely:
SSML — Speech Synthesis Markup
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice name="en-US-JennyNeural">
    <prosody rate="slow" pitch="+10%">
      Welcome to the Azure AI course.
    </prosody>
    <break time="2s"/>
    <emphasis level="strong">This is important!</emphasis>
  </voice>
</speak>
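The same SSML can be assembled programmatically. A small helper, reusing the voice and prosody values from the example above (the rate and pitch defaults are illustrative):

```python
# Sketch: build an SSML document like the one above from plain text.
def make_ssml(text: str, voice: str = "en-US-JennyNeural",
              rate: str = "slow", pitch: str = "+10%") -> str:
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<voice name="{voice}">'
        f'<prosody rate="{rate}" pitch="{pitch}">{text}</prosody>'
        "</voice></speak>"
    )

ssml = make_ssml("Welcome to the Azure AI course.")
print(ssml)
```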
For simultaneous speech recognition + translation: use TranslationRecognizer with SpeechTranslationConfig — not SpeechRecognizer.
Module 3 · Section 4

Azure AI Translator

Neural machine translation, transliteration, language detection, and custom models.

🌍
Translator Key Facts

  • Neural Machine Translation for 100+ languages
  • Global endpoint: api.cognitive.microsofttranslator.com — NOT regional!
  • Required headers: Ocp-Apim-Subscription-Key AND Ocp-Apim-Subscription-Region
  • Operations: Translate, Transliterate, Detect, Dictionary Lookup, Dictionary Examples, BreakSentence
  • Custom Translator: upload parallel bilingual documents → domain-specific model
  • Document Translation: async translation of entire files (PDF, Word, PowerPoint) maintaining formatting
Python — Translate Text
import requests, uuid

url = "https://api.cognitive.microsofttranslator.com/translate"
params = {"api-version": "3.0", "from": "en", "to": ["fr", "ar"]}
headers = {
    "Ocp-Apim-Subscription-Key": "YOUR_KEY",
    "Ocp-Apim-Subscription-Region": "eastus",  # REQUIRED!
    "Content-Type": "application/json",
    "X-ClientTraceId": str(uuid.uuid4())
}
body = [{"text": "Azure AI is amazing!"}]

response = requests.post(url, params=params, headers=headers, json=body)
print(response.json())
# [{"translations": [{"text": "Azure AI est incroyable!", "to": "fr"}, ...]}]
The Ocp-Apim-Subscription-Region header is mandatory for Translator when using a multi-service resource key. Omitting it causes a 401 error. Azure AI Language does NOT require this header.
Module 4 · Section 1

Azure AI Search

Indexes, indexers, data sources, query types — simple, semantic, and vector.

🔎
Azure AI Search Architecture

Core Components
  • Index — schema: fields with types + attributes
  • Data Source — Blob, SQL, Cosmos DB, SharePoint
  • Indexer — automated pipeline, supports scheduling
  • Skillset — AI enrichment pipeline (OCR, NLP)
  • Knowledge Store — persist enrichments to Storage
Field Attributes
  • searchable — full-text indexed
  • filterable — use in $filter
  • sortable — use in $orderby
  • facetable — aggregate counts
  • retrievable — returned in results
  • key — unique document identifier
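The field attributes map directly onto the index definition JSON. A sketch in the REST API's general shape (field names are hypothetical; attributes not shown fall back to service defaults):

```python
import json

# Sketch of an index definition showing how field attributes appear
# in the schema. Field names are hypothetical.
index_definition = {
    "name": "hotels-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "description", "type": "Edm.String",
         "searchable": True, "retrievable": True},
        {"name": "category", "type": "Edm.String",
         "filterable": True, "facetable": True},
        {"name": "rating", "type": "Edm.Double",
         "filterable": True, "sortable": True},
    ],
}
print(json.dumps(index_definition, indent=2))
```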

🔬
Query Types

Type | How It Works | When to Use
Simple | Basic keyword matching with wildcards | Simple text search
Full Lucene | Regex, fuzzy (~), proximity, boosting | Advanced keyword search
Semantic | Re-ranks results using language models; extracts captions + answers | Natural language queries
Vector | ANN search on embedding vectors (HNSW algorithm) | Semantic similarity / RAG
Hybrid | Combines keyword + vector via Reciprocal Rank Fusion | Best overall recall
Exam: HNSW = Hierarchical Navigable Small World = ANN vector search algorithm (approximate, fast). EKNN = Exhaustive KNN = exact but slow. Use HNSW for production scale.
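Reciprocal Rank Fusion is easy to sketch locally: each document earns 1/(k + rank) from every result list it appears in, and the scores are summed. A minimal version (k = 60 is the constant commonly used in the literature):

```python
# Sketch of Reciprocal Rank Fusion over multiple ranked result lists.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword = ["doc2", "doc1", "doc5"]   # BM25 order
vector  = ["doc1", "doc3", "doc2"]   # ANN order
print(rrf([keyword, vector]))  # ['doc1', 'doc2', 'doc3', 'doc5']
```

Note how doc1 wins overall despite topping only one list: appearing high in both lists beats a single first place.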
Module 4 · Section 2

AI Enrichment & Knowledge Store

Skillsets, built-in skills, custom WebApiSkill, and persisting enrichments.

⚙️
Built-in Skills

Text Skills
  • LanguageDetectionSkill
  • KeyPhraseExtractionSkill
  • EntityRecognitionSkill
  • SentimentSkill
  • SplitSkill (chunk documents)
  • MergeSkill (combine fields)
  • TextTranslationSkill
Vision + Custom Skills
  • OcrSkill (extract text from images)
  • ImageAnalysisSkill (tags, captions)
  • ShaperSkill (reshape into complex type)
  • WebApiSkill — call any Azure Function or HTTP endpoint
  • AzureMachineLearningSkill
Custom Skill Pattern: WebApiSkill calls an Azure Function that accepts a JSON body with a values array, processes each record, and returns a values array with enriched output. This lets you integrate any ML model.

💾
Knowledge Store

  • Persists AI enrichments from indexing to Azure Storage (outside the search index)
  • Three projection types:
      • Tables → Azure Table Storage (rows per document/chunk)
      • Objects → Azure Blob as JSON files
      • Files → Azure Blob as normalized images
  • Used for: Power BI analysis, downstream ML pipelines, audit trails
  • generatedKeyName defines the RowKey field in table projections
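A sketch of how a table projection might look inside a skillset's knowledgeStore definition. The names and the connection-string placeholder are hypothetical, and the exact JSON shape should be checked against the current REST reference:

```python
import json

# Hypothetical knowledgeStore fragment: 'generatedKeyName' becomes the
# RowKey field in the table projection.
knowledge_store = {
    "storageConnectionString": "<storage-connection-string>",
    "projections": [
        {
            "tables": [
                {
                    "tableName": "docKeyPhrases",
                    "generatedKeyName": "docId",
                    "source": "/document/keyPhrases/*",
                }
            ],
            "objects": [],
            "files": [],
        }
    ],
}
print(json.dumps(knowledge_store, indent=2))
```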
Module 4 · Section 3

Document Intelligence

Pre-built models, custom template/neural models, composed models, and Layout API.

📄
Pre-built Models

Model | What It Extracts
Read | OCR — all text with language detection, no semantic fields
Layout | Structure: paragraphs, tables, headings, selection marks (checkboxes)
Invoice | VendorName, InvoiceTotal, LineItems, Tax, DueDate, etc.
Receipt | MerchantName, TransactionDate, Total, Items
ID Document | Name, DOB, address, document number from IDs/passports
Business Card | Name, company, email, phone, address
W-2 / 1040 | US tax form fields

🎓
Custom Models

  • Custom Template: fixed-layout forms (consistent field positions) — minimum 5 labelled samples
  • Custom Neural: variable-layout forms — requires ~15+ labelled samples, handles diverse layouts
  • Composed Model: bundle multiple custom models under one model ID — service picks the best match
  • Label using the Document Intelligence Studio: documentintelligence.ai.azure.com
  • selectionMarks output = checkboxes and radio buttons with selected/unselected state
Exam: "200 PDFs with same layout" → Custom Template. "Variable layout invoices" → Custom Neural. "Generic invoice extraction without training" → Pre-built Invoice model.
Module 5 · Section 1

Azure OpenAI Service

Models, deployments, completions API, parameters, and content filtering.

🤖
Azure OpenAI Fundamentals

Available Models
  • GPT-4o, GPT-4, GPT-4-Turbo (chat + vision)
  • GPT-3.5-Turbo (fast, economical)
  • text-embedding-3-small / large, ada-002
  • DALL-E 3 (image generation)
  • Whisper (speech-to-text)
Key Concepts
  • Requires application approval to access
  • You deploy a model → give it a deployment name
  • API calls use deployment name, NOT model name
  • Quota measured in TPM (tokens per minute)
  • Content filtering built-in (4 harm categories)
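TPM quota translates into a request budget. A back-of-the-envelope sketch, assuming the rough 4-characters-per-token heuristic for English text (use a real tokenizer such as tiktoken for accurate counts):

```python
# Rough capacity estimate against a per-deployment TPM quota.
# ~4 characters per token is a heuristic, not a tokenizer.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def requests_per_minute(tpm_quota: int, avg_prompt: str, max_completion: int) -> int:
    per_request = estimate_tokens(avg_prompt) + max_completion
    return tpm_quota // per_request

prompt = "Explain Azure AI Search in 3 points." * 10  # ~360 chars, ~90 tokens
print(requests_per_minute(30_000, prompt, max_completion=500))  # 50
```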

💻
Chat Completions API

Python — Azure OpenAI
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-openai.openai.azure.com/",
    api_key="YOUR_API_KEY",
    api_version="2024-02-01"
)

response = client.chat.completions.create(
    model="gpt4o-prod",  # YOUR deployment name, not "gpt-4o"!
    messages=[
        {"role": "system",  "content": "You are a helpful Azure AI expert."},
        {"role": "user",    "content": "Explain Azure AI Search in 3 points."}
    ],
    temperature=0.3,    # 0=deterministic, 1=creative
    max_tokens=500,
    top_p=0.9,
)

print(response.choices[0].message.content)
print(f"Tokens used: {response.usage.total_tokens}")
print(f"Finish reason: {response.choices[0].finish_reason}")
# finish_reason: "stop" | "length" | "tool_calls" | "content_filter"
Module 5 · Section 2

Prompt Engineering

Zero-shot, few-shot, CoT, system prompts, and parameter tuning.

✏️
Prompting Techniques

Technique | Description | Best For
Zero-shot | No examples in prompt — direct instruction only | Simple, well-defined tasks
Few-shot | 2–5 input→output examples before the actual input | Formatting, classification
Chain-of-Thought | Add "Think step by step" — model explains reasoning | Math, logic, complex decisions
System Prompt | Set persona, constraints, output format, safety rules | All production applications
Grounding | Include retrieved context; instruct "only use provided context" | RAG — reduce hallucinations

🎛️
Parameter Reference

Parameter | Range | Effect
temperature | 0–2 | 0 = deterministic/factual, 1 = balanced, 2 = creative/random
top_p | 0–1 | Nucleus sampling — 0.1 = only tokens in the top 10% of probability mass, 1 = all tokens
max_tokens | 1–model max | Maximum response length (not input length)
frequency_penalty | -2 to 2 | Reduce repetition of frequent tokens
presence_penalty | -2 to 2 | Encourage topic diversity (new subjects)
stop | list of strings | Stop generation at specified sequences
Don't set both temperature and top_p at non-default values simultaneously — they interact. Pick one to control diversity.
Module 5 · Section 3

RAG Architecture

Retrieval-Augmented Generation on Azure — embeddings, vector search, and grounding.

🗃️
RAG Pipeline on Azure

  • Step 1 — Ingest: chunk documents into ~512 token segments → generate embeddings via text-embedding-3-small
  • Step 2 — Store: vector index in Azure AI Search (HNSW) or Cosmos DB for MongoDB vCore
  • Step 3 — Retrieve: embed user query → ANN search → top-K chunks
  • Step 4 — Generate: inject chunks into prompt → GPT-4o generates grounded answer
  • System prompt: "Only answer from the provided context. Say 'I don't know' if not found."
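Step 1's chunking can be sketched with a word-based splitter with overlap (words as a crude proxy for tokens; real pipelines count tokens and respect sentence boundaries):

```python
# Sketch: split a document into overlapping chunks before embedding.
# Sizes are in words here as a simple stand-in for tokens.
def chunk(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

doc = " ".join(f"w{i}" for i in range(1000))
chunks = chunk(doc)
print(len(chunks), len(chunks[0].split()))  # 3 512
```

The overlap means the end of each chunk reappears at the start of the next, so a fact straddling a boundary still lands whole in at least one chunk.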
Python — RAG with Azure OpenAI + AI Search
# 1. Generate embedding for the user query
embedding_response = client.embeddings.create(
    model="text-embedding-3-small",  # deployment name
    input=user_query
)
query_vector = embedding_response.data[0].embedding

# 2. Vector search in AI Search
from azure.search.documents.models import VectorizedQuery
results = search_client.search(
    search_text=None,
    vector_queries=[VectorizedQuery(
        vector=query_vector, k_nearest_neighbors=5,
        fields="content_vector"
    )]
)
chunks = [r["content"] for r in results]

# 3. Generate grounded answer
context = "\n\n".join(chunks)
response = client.chat.completions.create(
    model="gpt4o-prod",
    messages=[
        {"role": "system", "content": f"Answer ONLY from:\n\n{context}"},
        {"role": "user",   "content": user_query}
    ]
)
Module 5 · Section 4

DALL-E 3 & Content Safety

Image generation, content filters, prompt shields, and groundedness detection.

🎨
DALL-E 3 Image Generation

  • Resolutions: 1024×1024, 1024×1792 (portrait), 1792×1024 (landscape)
  • Style: vivid (hyperrealistic) or natural (softer, less saturated)
  • Quality: standard or hd (more detail, higher cost)
  • Response includes revised_prompt — DALL-E may rewrite for safety/quality
  • Image URL expires after 1 hour — download and store if needed
Python — DALL-E 3
result = client.images.generate(
    model="dall-e-3",   # your deployment name
    prompt="A futuristic Azure data center glowing blue in space",
    n=1,
    size="1792x1024",
    quality="hd",
    style="vivid"
)
print(result.data[0].url)            # temporary URL (1 hr expiry)
print(result.data[0].revised_prompt)  # what DALL-E actually used

🛡️
Azure AI Content Safety Features

Feature | What It Detects
Text/Image Moderation | Hate, Sexual, Violence, Self-harm (severity 0–6)
Prompt Shield | Jailbreak attacks, indirect prompt injection
Groundedness Detection | AI response contradicts or ignores retrieved context (RAG)
Protected Material | Copyrighted text or code in model output
Custom Categories | Define your own harm categories with examples